RAW SEQUENCE LISTING 
ERROR REPORT 


BIOTECHNOLOGY
SYSTEMS



The Biotechnology Systems Branch of the Scientific and Technical Information Center 
(STIC) detected errors when processing the following CRF diskette: 


Date Processed by STIC: 


THE ATTACHED PRINTOUT EXPLAINS THE ERRORS DETECTED. 

PLEASE BE SURE TO FORWARD THIS INFORMATION TO THE APPLICANTS 
BY EITHER: 

1) INCLUDING A COPY OF THIS PRINTOUT IN YOUR NEXT
COMMUNICATION TO THE APPLICANTS ALONG WITH A NOTICE TO
COMPLY, OR

2) CALLING APPLICANTS AND FAXING THEM A COPY OF THE PRINTOUT
WITH A NOTICE TO COMPLY.

THIS WILL ENSURE THAT THE NEXT SUBMISSION RECEIVED FROM THEM
WILL BE ERROR FREE.

IF YOU HAVE ANY FURTHER QUESTIONS, PLEASE CALL: 
ARTI SHAH 703-308-4212 


Art Unit / Team No. : 


Application Serial Number: 



Raw Sequence Listing Error Summary 


ERROR DETECTED SUGGESTED CORRECTION 


SERIAL NUMBER: 


ATTN: NEW RULES CASES: PLEASE DISREGARD ENGLISH "ALPHA" HEADERS, WHICH WERE INSERTED BY PTO SOFTWARE.


1 


Wrapped Nucleics
    The number/text at the end of each line "wrapped" down to the next line.
    This may occur if your file was retrieved in a word processor after creating it.
    Please adjust your right margin to .3, as this will prevent "wrapping".

Wrapped Aminos
    The amino acid number/text at the end of each line "wrapped" down to the next line.
    This may occur if your file was retrieved in a word processor after creating it.
    Please adjust your right margin to .3, as this will prevent "wrapping".

Incorrect Line Length
    The rules require that a line not exceed 72 characters in length. This includes spaces.
    All text must be visible on page.

Misaligned Amino Acid Numbering
    The numbering under each 5th amino acid is misaligned. This may be caused by the use of tabs
    between the numbering. It is recommended to delete any tabs and use spacing between the numbers.

Non-ASCII
    This file was not saved in ASCII (DOS) text, as required by the Sequence Rules.
    Please ensure your subsequent submission is saved in ASCII text so that it can be processed.

Variable Length
    Sequence(s) contain n's or Xaa's which represented more than one residue.
    As per the rules, each n or Xaa can only represent a single residue.
    Please present the maximum number of each residue having variable length and
    indicate in the (ix) features section that some may be missing.

Wrong Designation
    Sequence(s) contain amino acid or nucleic acid designators which are not standard
    representations as per the Sequence Rules (please refer to paragraph 1.822).

Skipped Sequences (OLD RULES)
    Sequence(s) missing. If intentional, please use the following format for each skipped sequence:

    (2) INFORMATION FOR SEQ ID NO:X:
    (i) SEQUENCE CHARACTERISTICS: (Do not insert any headings under "SEQUENCE CHARACTERISTICS")
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:X:
    This sequence is intentionally skipped

    Please also adjust the "(iii) NUMBER OF SEQUENCES:" response to include the skipped sequence(s).

Skipped Sequences (NEW RULES)
    Sequence(s) missing. If intentional, please use the following format for each skipped sequence:

    <210> sequence id number
    <400> sequence id number
    000
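
The line-level errors described above (lines longer than 72 characters, tabs in the numbering, and characters outside ASCII) can be screened for before a diskette is resubmitted. The following is a minimal sketch in Python, not the checker used by STIC; the function name check_listing_lines and the message strings are assumptions made for illustration.

```python
MAX_LINE_LENGTH = 72  # limit quoted in the error summary; spaces are counted


def check_listing_lines(text):
    """Return (line_number, problem) pairs for lines that break the stated rules."""
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        if len(line) > MAX_LINE_LENGTH:
            problems.append((number, "line exceeds 72 characters"))
        if "\t" in line:
            # Tabs are the suggested cause of misaligned amino acid numbering.
            problems.append((number, "tab character; use spaces"))
        if not line.isascii():
            problems.append((number, "non-ASCII character"))
    return problems
```

A listing that returns an empty list here may still contain the other errors in the summary; the sketch covers only the three checks that can be made one line at a time.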


Use of N's or Xaa's (NEW RULES)
    Use of n's and/or Xaa's has been detected in the Sequence Listing.
    Use of <220> to <223> is MANDATORY if n's or Xaa's are present.
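
The rule above can be checked mechanically for a single sequence entry in the numeric-identifier format. This is a hedged sketch only: the helper names, the assumption that bases are written in lowercase, and the literal reading of "<220> to <223>" as all four identifiers are illustrative choices, not part of the notice.

```python
# Assumption: bases appear in lowercase, so capitalised amino acid codes
# such as "Asn" are never mistaken for a nucleotide block containing n.
NUCLEOTIDE_CODES = set("acgturymkswbdhvn")
FEATURE_TAGS = ("<220>", "<221>", "<222>", "<223>")


def uses_variable_residues(entry):
    """True if the residues after <400> contain an n or an Xaa."""
    if "<400>" not in entry:
        return False
    residues = entry.split("<400>", 1)[1]
    for token in residues.split():
        if token == "Xaa":
            return True
        if "n" in token and set(token) <= NUCLEOTIDE_CODES:
            return True
    return False


def missing_feature_tags(entry):
    """List the feature identifiers an entry lacks when n or Xaa is present."""
    if not uses_variable_residues(entry):
        return []
    return [tag for tag in FEATURE_TAGS if tag not in entry]
```

An entry with no n or Xaa returns an empty list regardless of which feature identifiers it carries.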


Use of <213> Organism (NEW RULES)
    Sequence(s) are missing this mandatory field or its response.

Use of <220> Feature (NEW RULES)
    Sequence(s) are missing the <220> Feature and associated headings.
    Use of <220> to <223> is MANDATORY if <213> ORGANISM is "Artificial" or "Unknown".
    (See "Federal Register," 6/01/98, Vol. 63, No. 104, pp. 29631-32)
    (Sec. 1.823 of new Sequence Rules)
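
The two <213>-related checks above lend themselves to a short script. The sketch below assumes one sequence entry is passed as a string and reads "<220> to <223>" literally as identifiers 220 through 223; the function name organism_problems and the message wording are hypothetical.

```python
import re


def organism_problems(entry):
    """Report a missing <213> response, or missing features for Artificial/Unknown."""
    match = re.search(r"<213>[ \t]*(.*)", entry)
    organism = match.group(1).strip() if match else ""
    if not organism:
        return ["missing <213> Organism or its response"]
    problems = []
    if organism.lower().startswith(("artificial", "unknown")):
        for number in range(220, 224):  # <220> through <223>
            tag = f"<{number}>"
            if tag not in entry:
                problems.append(f"{tag} required when <213> is {organism}")
    return problems
```

For an ordinary organism name the function returns an empty list and makes no demand for feature identifiers.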


Wrong Format
    File submitted was in the alphabetical heading format of the Old Sequence Rules. This is invalid since the
    "Requirements for Patent Applications Containing Nucleotide Sequence and/or Amino Acid Disclosures,"
    Federal Register Notice, Vol. 63, No. 104, June 1, 1998, p. 29620,
    applies to applications filed on or after July 1, 1998.
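
Whether a file uses the old alphabetical headings or the new numeric identifiers can be guessed from its first recognisable heading. The sketch below is an assumption-laden illustration: the two regular expressions and the name detect_listing_format are not taken from the rules. Numeric identifiers are tested first because, as noted above, PTO software may insert "alpha" headers into new-rules files.

```python
import re

# New rules: numeric identifiers such as <110> or <210> at the start of a line.
NUMERIC_ID = re.compile(r"^\s*<\d{3}>", re.MULTILINE)
# Old rules: headings such as "(1) GENERAL INFORMATION:" or "(i) APPLICANT:".
ALPHA_HEADING = re.compile(
    r"^\s*\((?:\d+|[ivx]+|[A-K])\)\s+[A-Z][A-Z /-]+:", re.MULTILINE
)


def detect_listing_format(text):
    """Return 'new', 'old', or 'unknown' for a sequence listing file."""
    if NUMERIC_ID.search(text):
        return "new"
    if ALPHA_HEADING.search(text):
        return "old"
    return "unknown"
```

A file with free-form headings such as "Length of sequence:" matches neither pattern and is reported as unknown.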

AKS-Biotechnology Systems Branch- 7/10/98 


(i) APPLICANT: SNOW BRAND MILK PRODUCTS CO.
(ii) TITLE OF INVENTION: DNA and process for preparing protein using the DNA
(iii) NUMBER OF SEQUENCES: 4

Length of sequence: 1316
Sequence type: nucleic acid
Strandedness: double
Topology: linear
Molecule type: genomic DNA


(iv) CORRESPONDENCE ADDRESS:

(A) ADDRESSEE:

(B) STREET:
(C) CITY:

(D) STATE:

(E) COUNTRY:
(F) ZIP:

(v) COMPUTER READABLE FORM:

(A) MEDIUM TYPE:
(B) COMPUTER:
(C) OPERATING SYSTEM:
(D) SOFTWARE:

(vi) CURRENT APPLICATION DATA:
(A) APPLICATION NUMBER:

(B) FILING DATE:
(C) CLASSIFICATION:


(human OCIF genomic DNA-1)


CTGGAGACAT ATAACTTGAA CACTTGGCCC TGATGGGGAA GCAGCTCTGC AGGGACTTTT 60 
TCAGCCATCT GTAAACAATT TCAGTGGCAA CCCGCGAACT GTAATCCATG AATGGGACCA 120 
CACTTTACAA GTCATCAAGT CTAACTTCTA GACCAGGGAA TTAATGGGGG AGACAGCGAA 180 
CCCTAGAGCA AAGTGCCAAA CTTCTGTCGA TAGCTTGAGG CTAGTGGAAA GACCTCGAGG 240 
AGGCTACTCC AGAAGTTCAG CGCGTAGGAA GCTCCGATAC CAATAGCCCT TTGATGATGG 300 
TGGGGTTGGT GAAGGGAACA GTGCTCCGCA AGGTTATCCC TGCCCCAGGC AGTCCAATTT 360 
TCACTCTGCA GATTCTCTCT GGCTCTAACT ACCCCAGATA ACAAGGAGTG AATGCAGAAT 420 
AGCACGGGCT TTAGGGCCAA TCAGACATTA GTTAGAAAAA TTCCTACTAC ATGGTTTATG 480 
TAAACTTGAA GATGAATGAT TGCGAACTCC CCGAAAAGGG CTCAGACAAT GCCATGCATA 540 
AAGAGGGGCC CTGTAATTTG AGGTTTCAGA ACCCGAAGTG AAGGGGTCAG GCAGCCGGGT 600 
ACGGCGGAAA CTCACAGCTT TCGCCCAGCG AGAGGACAAA GGTCTGGGAC ACACTCCAAC 660 
TGCGTCCGGA TCTTGGCTGG ATCGGACTCT CAGGGTGGAG GAGACACAAG CACAGCAGCT 720
GCCCAGCGTG TGCCCAGCCC TCCCACCGCT GGTCCCGGCT GCCAGGAGGC TGGCCGCTGG 780
CGGGAAGGGG CCGGGAAACC TCAGAGCCCC GCGGAGACAG CAGCCGCCTT GTTCCTCAGC 840 
CCGGTGGCTT TTTTTTCCCC TGCTCTCCCA GGGGACAGAC ACCACCGCCC CACCCCTCAC 900 
GCCCCACCTC CCTGGGGGAT CCTTTCCGCC CCAGCCCTGA AAGCGTTAAT CCTGGAGCTT 960 
TCTGCACACC CCCCGACCGC TCCCGCCCAA GCTTCCTAAA AAAGAAAGGT GCAAAGTTTG 1020 
GTCCAGGATA GAAAAATGAC TGATCAAAGG CAGGCGATAC TTCCTGTTGC CGGGACGCTA 1080 
TATATAACGT GATGAGCGCA CGGGCTGCGG AGACGCACCG GAGCGCTCGC CCAGCCGCC^J. 1 4 0 

rrr r^ isisr rr rcvr ^r r . TTT ncr r r^rT ^ ™ atg aac aag ttg ctg tgc TGcn|@]Li9 3 



gcg ctc gtg gtaagtccct gggccagccg acgggtgccc ggcgcctggg( 

Ala Leu Val 



!@I@1@L242 


GAGGCTGCTG mrrraavr TrfPAACCTC CCAGCGGACC GGCGGGGAGA AGGCTCCACT 1302 


CGCTCCCTCC CAG( Q@lQl@l@M@i@l(§i@|(ai(ai@ l@l@|@|@| @|@iQi©|@i(al@|@^ 316 . Sjj^X 


Length of sequence: 9898
Sequence type: nucleic acid
Strandedness: double
Topology: linear
Molecule type: genomic DNA (human OCIF genomic DNA-2)

GCTTACTTTG TGCCAAATCT CATTAGGCTT AAGGTAATAC AGGACTTTGA GTCAAATGAT 60
ACTGTTGCAC ATAAGAACAA ACCTATTTTC ATGCTAAGAT GATGCCACTG TGTTCCTTTC 120
£C£££CTAe^TT CTG GAC ATC TCC ATT AAG TGG ACC ACC CAG GAA ACG TTT 171
             Leu Asp Ile Ser Ile Lys Trp Thr Thr Gln Glu Thr Phe


CCT CCA AAG TAC CTT CAT TAT GAC GAA GAA ACC TCT CAT CAG CTG TTG
Pro Pro Lys Tyr Leu His Tyr Asp Glu Glu Thr Ser His Gln Leu Leu

TGT GAC AAA TGT CCT CCT GGT ACC TAC CTA AAA CAA CAC TGT ACA GCA 267
Cys Asp Lys Cys Pro Pro Gly Thr Tyr Leu Lys Gln His Cys Thr Ala
20 25 30 35

AAG TGG AAG ACC GTG TGC GCC CCT TGC CCT GAC CAC TAC TAC ACA GAC 315
Lys Trp Lys Thr Val Cys Ala Pro Cys Pro Asp His Tyr Tyr Thr Asp
40 45 50

AGC TGG CAC ACC AGT GAC GAG TGT CTA TAC TGC AGC CCC GTG TGC AAG 363
Ser Trp His Thr Ser Asp Glu Cys Leu Tyr Cys Ser Pro Val Cys Lys
55 60 65

GAG CTG CAG TAC GTC AAG CAG GAG TGC AAT CGC ACC CAC AAC CGC GTG 411
Glu Leu Gln Tyr Val Lys Gln Glu Cys Asn Arg Thr His Asn Arg Val
70 75 80

TGC GAA TGC AAG GAA GGG CGC TAC CTT GAG ATA GAG TTC TGC TTG AAA 459
Cys Glu Cys Lys Glu Gly Arg Tyr Leu Glu Ile Glu Phe Cys Leu Lys
85 90 95

CAT AGG AGC TGC CCT CCT GGA TTT GGA GTG GTG CAA GCT G GTACGTGTC^i@j 509
His Arg Ser Cys Pro Pro Gly Phe Gly Val Val Gln Ala
100 105 110


ATGTGCAGCA AAATTAATTA GGATCATGCA AAGTCAGATA GTTGTGACAG TTTAGGAGAA 569
CACTTTTGTT CTGATGACAT TATAGGATAG CAAATTGCAA AGGTAATGAA ACCTGCCAGG 629
TAGGTACTAT GTGTCTGGAG TGCTTCCAAA GGACCATTGC TCAGAGGAAT ACTTTGCCAC 689
TACAGGGCAA TTTAATGACA AATCTCAAAT GCAGCAAATT ATTCTCTCAT GAGATGCATG 749
ATGGTTTTTT TTTTTTTTTTT TAAAGAAACA AACTCAAGTT GCACTATTGA TAGTTGATCT 809
ATACCTCTAT ATTTCACTTC AGCATGGACA CCTTCAAACT GCAGCACTTT TTGACAAACA 869
TCAGAAATGT TAATTTATAC CAAGAGAGTA ATTATGCTCA TATTAATGAG ACTCTGGAGT 929
GCTAACAATA AGCAGTTATA ATTAATTATG TAAAAAATGA GAATGGTGAG GGGAATTGCA 989
TTTCATTATT AAAAACAAGG CTAGTTCTTC CTTTAGCATG GGAGCTGAGT GTTTGGGAGG 1049
GTAAGGACTA TAGCAGAATC TCTTCAATGA GCTTATTCTT TATCTTAGAC AAAACAGATT 1109
GTCAAGCCAA GAGCAAGCAC TTGCCTATAA ACCAAGTGCT TTCTCTTTTG CATTTTGAAC 1169
AGCATTGGTC AGGGCTCATG TGTATTGAAT CTTTTAAACC AGTAACCCAC GTTTTTTTTC 1229
TGCCACATTT GCGAAGCTTC AGTGCAGCCT ATAACTTTTC ATAGCTTGAG AAAATTAAGA 1289
GTATCCACTT ACTTAGATGG AAGAAGTAAT CAGTATAGAT TCTGATGACT CAGTTTGAAG 1349
CAGTGTTTCT CAACTGAAGC CCTGCTGATA TTTTAAGAAA TATCTGGATT CCTAGGCTGG 1409
ACTCCTTTTT GTGGGCAGCT GTCCTGCGCA TTGTAGAATT TTGGCAGCAC CCCTGGACTC 1469
TAGCCACTAG ATACCAATAG CAGTCCTTCC CCCATGTGAC AGCCAAAAAT GTCTTCAGAC 1529
ACTGTCAAAT GTCGCCAGGT GGCAAAATCA CTCCTGGTTG AGAACAGGGT CATCAATGCT 1589
AAGTATCTGT AACTATTTTA ACTCTCAAAA CTTGTGATAT ACAAAGTCTA AATTATTAGA 1649
CGACCAATAC TTTAGGTTTA AAGGCATACA AATGAAACAT TCAAAAATCA AAATCTATTC 1709
TGTTTCTCAA ATAGTGAATC TTATAAAATT AATCACAGAA GATGCAAATT GCATCAGAGT 1769
CCCTTAAAAT TCCTCTTCGT ATGAGTATTT GAGGGAGGAA TTGGTGATAG TTCCTACTTT 1829
CTATTGGATG GTACTTTGAG ACTCAAAAGC TAAGCTAAGT TGTGTGTGTG TCAGGGTGCG 1889
GGGTGTGGAA TCCCATCAGA TAAAAGCAAA TCCATGTAAT TCATTCAGTA AGTTGTATAT 1949
GTAGAAAAAT GAAAAGTGGG CTATGCAGCT TGGAAACTAG AGAATTTTGA AAAATAATGG 2009
AAATCACAAG GATCTTTCTT AAATAAGTAA GAAAATCTGT TTGTAGAATG AAGCAAGCAG 2069
GCAGCCAGAA GACTCAGAAC AAAAGTACAC ATTTTACTCT GTGTACACTG GCAGCACAGT 2129
GGGATTTATT TACCTCTCCC TCCCTAAAAA CCCACACAGC GGTTCCTCTT GGGAAATAAG 2189
AGGTTTCCAG CCCAAAGAGA AGGAAAGACT ATGTGGTGTT ACTCTAAAAA GTATTTAATA 2249
ACCGTTTTGT TGTTGCTGTT GCTGTTTTGA AATCAGATTG TCTCCTCTCC ATATTTTATT 2309
TACTTCATTC TGTTAATTCC TGTGGAATTA CTTAGAGCAA GCATGGTGAA TTCTCAACTG 2369


TAAAGCCAAA TTTCTCCATC ATTATAATTT CACATTTTGC CTGGCAGGTT ATAATTTTTA 2429
TATTTCCACT GATAGTAATA AGGTAAAATC ATTACTTAGA TGGATAGATC TTTTTCATAA 2489
AAAGTACCAT CAGTTATAGA GGGAAGTCAT GTTCATGTTC AGGAAGGTCA TTAGATAAAG 2549
CTTCTGAATA TATTATGAAA CATTAGTTCT GTCATTCTTA GATTCTTTTT GTTAAATAAC 2609
TTTAAAAGCT AACTTACCTA AAAGAAATAT CTGACACATA TGAACTTCTC ATTAGGATGC 2669
AGGAGAAGAC CCAAGCCACA GATATGTATC TGAAGAATGA ACAAGATTCT TAGGCCCGGC 2729
ACGGTGGCTC ACATCTGTAA TCTCAAGAGT TTGAGAGGTC AAGGCGGGCA GATCACCTGA 2789
GGTCAGGAGT TCAAGACCAG CCTGGCCAAC ATGATGAAAC CCTGCCTCTA CTAAAAATAC 2849
AAAAATTAGC AGGGCATGGT GGTGCATGCC TGCAACCCTA GCTACTCAGG AGGCTGAGAC 2909
AGGAGAATCT CTTGAACCCT CGAGGCGGAG GTTGTGGTGA GCTGAGATCC CTCTACTGCA 2969
CTCCAGCCTG GGTGACAGAG ATGAGACTCC GTCCCTGCCG CCGCCCCCGC CTTCCCCCCC 3029
AAAAAGATTC TTCTTCATGC AGAACATACG GCAGTCAACA AAGGGAGACC TGGGTCCAGG 3089
TGTCCAAGTC ACTTATTTCG AGTAAATTAG CAATGAAAGA ATGCCATGGA ATCCCTGCCC 3149
AAATACCTCT GCTTATGATA TTGTAGAATT TGATATAGAG TTGTATCCCA TTTAAGGAGT 3209
AGGATGTAGT AGGAAAGTAC TAAAAACAAA CACACAAACA GAAAACCCTC TTTGCTTTGT 3269
AAGGTGGTTC CTAAGATAAT GTCAGTGCAA TGCTGGAAAT AATATTTAAT ATGTGAAGGT 3329
TTTAGGCTGT GTTTTCCCCT CCTGTTCTTT TTTTCTGCCA GCCCTTTGTC ATTTTTGCAG 3389
GTCAATGAAT CATGTAGAAA GAGACAGGAG ATGAAACTAG AACCAGTCCA TTTTGCCCCT 3449
TTTTTTATTT TCTGGTTTTG GTAAAAGATA CAATGAGGTA GGAGGTTGAG ATTTATAAAT 3509
GAAGTTTAAT AAGTTTCTGT AGCTTTGATT TTTCTCTTTC ATATTTGTTA TCTTGCATAA 3569
GCCAGAATTG GCCTGTAAAA TCTACATATG GATATTGAAG TCTAAATCTG TTCAACTAGC 3629
TTACACTAGA TGGAGATATT TTCATATTCA GATACACTGG AATGTATGAT CTAGCCATGC 3689
GTAATATAGT CAAGTGTTTG AAGGTATTTA TTTTTAATAG CGTCTTTAGT TGTGGACTGG 3749
TTCAAGTTTT TCTGCCAATG ATTTCTTCAA ATTTATCAAA TATTTTTCCA TCATGAAGTA 3809
AAATGCCCTT GCAGTCACCC TTCCTGAAGT TTGAACGACT CTGCTGTTTT AAACAGTTTA 3869
AGCAAATGGT ATATCATCTT CCGTTTACTA TGTAGCTTAA CTGCAGGCTT ACGCTTTTGA 3929
GTCAGCGGCC AACTTTATTG CCACCTTCAA AAGTTTATTA TAATGTTGTA AATTTTTACT 3989
TCTCAAGGTT AGCATACTTA GGAGTTGCTT CACAATTAGG ATTCAGGAAA GAAAGAACTT 4049
CAGTAGGAAC TGATTGGAAT TTAATGATGC AGCATTCAAT GGGTACTAAT TTCAAAGAAT 4109
GATATTACAG CAGACACACA GCAGTTATCT TGATTTTCTA GGAATAATTG TATGAAGAAT 4169
ATGGCTGACA ACACGGCCTT ACTGCCACTC AGCGGAGGCT GGACTAATGA ACACCCTACC 4229
CTTCTTTCCT TTCCTCTCAC ATTTCATGAG CGTTTTGTAG GTAACGAGAA AATTGACTTG 4289
CATTTGCATT ACAAGGAGGA GAAACTGGCA AAGGGGATGA TGGTGGAAGT TTTGTTCTGT 4349
CTAATGAAGT GAAAAATGAA AATGCTAGAG TTTTGTGCAA CATAATAGTA GCAGTAAAAA 4409
CCAAGTGAAA AGTCTTTCCA AAACTGTGTT AAGAGGGCAT CTGCTGGGAA ACGATTTG. 4469
GAGAAGGTAC TAAATTGCTT GCTATTTTPP
■fiTAfl GA ACC CCA GAG CGA AAT AC. 4523
Gly Thr Pro Glu Arg Asn Thr
115

GTT TGC AAA AGA TGT CCA GAT GGG TTC TTC TCA AAT GAG ACG TCA TCT 4571
Val Cys Lys Arg Cys Pro Asp Gly Phe Phe Ser Asn Glu Thr Ser Ser
120 125 130

AAA GCA CCC TGT AGA AAA CAC ACA AAT TGC AGT GTC TTT GGT CTC CTG 4619
Lys Ala Pro Cys Arg Lys His Thr Asn Cys Ser Val Phe Gly Leu Leu
140 145 150

CTA ACT CAG AAA GGA AAT GCA ACA CAC GAC AAC ATA TGT TCC GGA AAC 4667
Leu Thr Gln Lys Gly Asn Ala Thr His Asp Asn Ile Cys Ser Gly Asn
155 160 165


AGT GAA TCA ACT CAA AAA TGT GGA ATA G GTAATTACAT TCCAAAATAC 4715
Ser Glu Ser Thr Gln Lys Cys Gly Ile
170 175

GTCTTTGTAC GATTTTGTAG TATCATCTCT CTCTCTGAGT TGAACACAAG GCCTCCAGCC 4775
ACATTCTTGG TCAAACTTAC ATTTTCCCTT TCTTGAATCT TAACCAGCTA AGGCTACTCT 4835


CGATGCATTA CTGCTAAAGC TACCACTCAG AATCTCTCAA AAACTCATCT TCTCACAGAT 4895
AACACCTCAA AGCTTGATTT TCTCTCCTTT CACACTGAAA TCAAATCTTG CCCATAGGCA 4955
AAGGGCAGTG TCAAGTTTGC CACTGAGATG AAATTAGGAG AGTCCAAACT GTAGAATTCA 5015
CGTTGTGTGT TATTACTTTC ACGAATGTCT GTATTATTAA CTAAAGTATA TATTGGCAAC 5075
TAAGAAGCAA AGTGATATAA ACATGATGAC AAATTAGGCC AGGCATGGTG GCTTACTCCT 5135
ATAATCCCAA CATTTTGGGG GGCCAAGGTA GGCAGATCAC TTGAGGTCAG GATTTCAAGA 5195
CCAGCCTGAC CAACATGGTG AAACCTTGTC TCTACTAAAA ATACAAAAAT TAGCTGGGCA 5255
TGGTAGCAGG CACTTCTAGT ACCAGCTACT CAGGGCTGAG GCAGGAGAAT CGCTTGAACC 5315
CAGGAGATGG AGGTTGCAGT GAGCTGAGAT TGTACCACTG CACTCCAGTC TGGGCAACAG 5375
AGCAAGATTT CATCACACAC ACACACACAC ACACACACAC ACACATTAGA AATGTGTACT 5435
TGGCTTTGTT ACCTATGGTA TTAGTGCATC TATTGCATGG AACTTCCAAG CTACTCTGGT 5495
TGTGTTAAGC TCTTCATTGG GTACAGGTCA CTAGTATTAA GTTCAGGTTA TTCGGATGCA 5555
TTCCACGGTA GTGATGACAA TTCATCAGGC TAGTGTGTGT GTTCACCTTG TCACTCCCAC 5615
CACTAGACTA ATCTCAGACC TTCACTCAAA GACACATTAC ACTAAAGATG ATTTGCTTTT 5675
TTGTGTTTAA TCAAGCAATG GTATAAACCA GCTTGACTCT CCCCAAACAG TTTTTCGTAC 5735
TACAAAGAAG TTTATGAAGC AGAGAAATGT GAATTGATAT ATATATGAGA TTCTAACCCA 5795
GTTCCAGCAT TGTTTCATTG TGTAATTGAA ATCATAGACA AGCCATTTTA GCCTTTGCTT 5855
TCTTATCTAA AAAAAAAAAA AAAAAAATGA AGGAAGGGGT ATTAAAAGGA GTGATCAAAT 5915
TTTAACATTC TCTTTAATTA ATTCATTTTT AATTTTACTT TTTTTCATTT ATTGTGCACT 5975
TACTATGTGG TACTGTGCTA TAGAGGCTTT AACATTTATA AAAACACTGT GAAAGTTGCT 6035
TCAGATGAAT ATAGGTAGTA GAACGGCAGA ACTAGTATTC AAAGCCAGGT CTGATGAATC 6095
CAAAAACAAA CACCCATTAC TCCCATTTTC TGGGACATAC TTACTCTACC CAGATGCTCT 6155
GGGCTTTGTA ATGCCTATGT AAATAACATA GTTTTATGTT TGGTTATTTT CCTATGTAAT 6215
GTCTACTTAT ATATCTGTAT CTATCTCTTG CTTTGTTTCC AAAGGTAAAC TATGTGTCTA 6275
AATGTGGGCA AAAAATAACA CACTATTCCA AATTACTGTT CAAATTCCTT TAAGTCAGTG 6335
ATAATTATTT GTTTTGACAT TAATCATGAA GTTCCCTGTG GGTACTAGGT AAACCTTTAA 6395
TAGAATGTTA ATGTTTGTAT TCATTATAAG AATTTTTGGC TGTTACTTAT TTACAACAAT 6455
ATTTCACTCT AATTAGACAT TTACTAAACT TTCTCTTGAA AACAATGCCC AAAAAAGAAC 6515
ATTAGAAGAC ACGTAAGCTC AGTTGGTCTC TGCCACTAAG ACCAGCCAAC AGAAGCTTGA 6575
TTTTATTCAA ACTTTGCATT TTAGCATATT TTATCTTGGA AAATTCAATT GTGTTGGTTT 6635
TTTGTTTTTG TTTGTATTGA ATAGACTCTC AGAAATCCAA TTGTTGAGTA AATCTTCTGf 6695
GJTTTQIAA6- CTTTCTT IAG AT GTT ACC CTG TGT GAG GAG GCA TTC TTC AGG 6747
Asp Val Thr Leu Cys Glu Glu Ala Phe Phe Arg
180 185

TTT GCT GTT CCT ACA AAG TTT ACG CCT AAC TGG CTT AGT GTC TTG GTA 6795
Phe Ala Val Pro Thr Lys Phe Thr Pro Asn Trp Leu Ser Val Leu Val
190 195 200

GAC AAT TTG CCT GGC ACC AAA GTA AAC GCA GAG AGT GTA GAG AGG ATA 6843
Asp Asn Leu Pro Gly Thr Lys Val Asn Ala Glu Ser Val Glu Arg Ile
205 210 215

AAA CGG CAA CAC AGC TCA CAA GAA CAG ACT TTC CAG CTG CTG AAG TTA 6891
Lys Arg Gln His Ser Ser Gln Glu Gln Thr Phe Gln Leu Leu Lys Leu
220 225 230 235


TGG AAA CAT CAA AAC AAA GAC CAA GAT ATA GTC AAG AAG ATC ATC CAA G 6940 
Trp Lys His Gln Asn Lys Asp Gln Asp Ile Val Lys Lys Ile Ile Gln

Pl@l@l@l@l^ 4 o(j©pi(aiei@i@i@r 



GTATGATAAT CTAAAATAAA AAGATCAATC AGAAATCAAA GACACCTATT TATCATAAAC 7000
CAGGAACAAG ACTGCATGTA TGTTTAGTTG TGTGGATCTT GTTTCCCTGT TGGAATCATT 7060
GTTGGACTGA AAAAGTTTCC ACCTGATAAT GTAGATGTGA TTCCACAAAC AGTTATACAA 7120
GGTTTTGTTC TCACCCCTGC TCCCCAGTTT CCTTGTAAAG TATGTTGAAC ACTCTAAGAG 7180
AAGAGAAATG CATTTGAAGG CAGGGCTGTA TCTCAGGGAG TCGCTTCCAG ATCCCTTAAC 7240
GCTTCTGTAA GCAGCCCCTC TAGACCACCA AGGAGAAGCT CTATAACCAC TTTGTATCTT 7300


ACATTGCACC TCTACCAAGA AGCTCTGTTG TATTTACTTG GTAATTCTCT CCAGGTAGGC 7360
TTTTCGTAGC TTACAAATAT GTTCTTATTA ATCCTCATGA TATGGCCTGC ATTAAAATTA 7420
TTTTAATGGC ATATGTTATG AGAATTAATG AGATAAAATC TGAAAAGTGT TTGAGCCTCT 7480
TGTAGGAAAA AGCTAGTTAC AGCAAAATGT TCTCACATCT TATAAGTTTA TATAAAGATT 7540
CTCCTTTAGA AATGGTGTGA GAGAGAAACA GAGAGAGATA GGGAGAGAAG TGTGAAAGAA 7600
TCTGAAGAAA AGGAGTTTCA TCCAGTGTGG ACTGTAAGCT TTACGACACA TGATGGAAAG 7660
AGTTCTGACT TCAGTAAGCA TTGGGAGGAC ATGCTAGAAG AAAAAGGAAG AAGAGTTTCC 7720
ATAATGCAGA CAGGGTCAGT GAGAAATTCA TTCAGGTCCT CACCAGTAGT TAAATGACTG 7780
TATAGTCTTG CACTACCCTA AAAAACTTCA AGTATCTGAA ACCGGGGCAA CAGATTTTAG 7840
GAGACCAACG TCTTTGAGAG CTGATTGCTT TTGCTTATGC AAAGAGTAAA CTTTTATGTT 7900
TTGAGCAAAC CAAAAGTATT CTTTGAACGT ATAATTAGCC CTGAAGCCGA AAGAAAAGAG 7960
AAAATCAGAG ACCGTTAGAA TTGGAAGCAA CCAAATTCCC TATTTTATAA ATGAGGACAT 8020
TTTAACCCAG AAAGATGAAC CGATTTGGCT TAGGGCTCAC AGATACTAAG TGACTCATGT 8080
CATTAATAGA AATGTTAGTT CCTCCCTCTT AGGTTTGTAC CCTAGCTTAT TACTGAAATA 8140
TTCTCTAGGC TGTGTGTCTC CTTTAGTTCC TCGACCTCAT GTCTTTGAGT TTTCAGATAT 8200
CCTCCTCATG GAGGTAGTCC TCTGGTGCTA TGTGTATTCT TTAAAGGCTA GTTACGGCAA 8260
TTAACTTATC AACTAGCGCC TACTAATGAA ACTTTGTATT ACAAAGTAGC TAACTTGAAT 8320
ACTTTCCTTT TTTTCTGAAA TGTTATGGTG GTAATTTCTC AAACTTTTTC TTAGAAAACT 8380
GAGAGTGATG TGTCTTATTT TCTACTGTTA ATTTTCAAAA TTAGGAGCTT CTTCCAAAGT 8440
TTTGTTGGAT GCCAAAAATA TATAGCATAT TATCTTATTA TAACAAAAAA TATTTATCTC 8500
AGTTCTTAGA AATAAATGGT GTCACTTAAC TCCCTCTCAA AAGAAAAGGT TATCATTGAA 8560
ATATAATTAT GAAATTCTGC AAGAACCTTT TGCCTCACGC TTGTTTTATG ATGGCATT' 8620
ATGAATAIAA ATHATHT HAA PACTTA 1 ^ T^ nnPTTTTHf^ AT ATT GAC 8676
Asp Ile Asp

CTC TGT GAA AAC AGC GTG CAG CGG CAC ATT GGA CAT GCT AAC CTC ACC 8724
Leu Cys Glu Asn Ser Val Gln Arg His Ile Gly His Ala Asn Leu Thr
255

TTC GAG CAG CTT CGT AGC TTG ATG GAA AGC TTA CCG GGA AAG AAA GTG 8772
Phe Glu Gln Leu Arg Ser Leu Met Glu Ser Leu Pro Gly Lys Lys Val
275

GGA GCA GAA GAC ATT GAA AAA ACA ATA AAG GCA TGC AAA CCC AGT GAC 8820
Gly Ala Glu Asp Ile Glu Lys Thr Ile Lys Ala Cys Lys Pro Ser Asp
290 295 300

CAG ATC CTG AAG CTG CTC AGT TTG TGG CGA ATA AAA AAT GGC GAC CAA 8868
Gln Ile Leu Lys Leu Leu Ser Leu Trp Arg Ile Lys Asn Gly Asp Gln
305 310 315

GAC ACC TTG AAG GGC CTA ATG CAC GCA CTA AAG CAC TCA AAG ACG TAC 8916
Asp Thr Leu Lys Gly Leu Met His Ala Leu Lys His Ser Lys Thr Tyr
320 325 330

CAC TTT CCC AAA ACT GTC ACT CAG AGT CTA AAG AAG ACC ATC AGG TTC 8964
His Phe Pro Lys Thr Val Thr Gln Ser Leu Lys Lys Thr Ile Arg Phe
335

CTT CAC AGC TTC ACA ATG TAC AAA TTG TAT CAG AAG TTA TTT TTA GAA 9012
Leu His Ser Phe Thr Met Tyr Lys Leu Tyr Gln Lys Leu Phe Leu Glu
355 360 365

ATG ATA GGT AAC CAG GTC CAA TCA GTA AAA ATA AGC TGC TTA 9054
Met Ile Gly Asn Gln Val Gln Ser Val Lys Ile Ser Cys Leu


TAACTGGAAA TGGCCATTGA GCTGTTTCCT CACAATTGGC GAGATCCCAT GGATGAGTAA 9114 
ACTGTTTCTC AGGCACTTGA GGCTTTCAGT GATATCTTTC TCATTACCAG TGACTAATTT 9174 
TGCCACAGGG TACTAAAAGA AACTATGATG TGGAGAAAGG ACTAACATCT CCTCCAATAA 9234 
ACCCCAAATG GTTAATCCAA CTGTCAGATC TGGATCGTTA TCTACTGACT ATATTTTCCC 9294 
TTATTACTGC TTGCAGTAAT TCAACTGGAA ATTAAAAAAA AAAAACTAGA CTCCACTGGG 9354
CCTTACTAAA TATGGGAATG TCTAACTTAA ATAGCTTTGG GATTCCAGCT ATGCTAGAGG 9414 
CTTTTATTAG AAAGCCATAT TTTTTTCTGT AAAAGTTACT AATATATCTG TAACACTATT 9474 
ACAGTATTGC TATTTATATT CATTCAGATA TAAGATTTGG ACATATTATC ATCCTATAAA 9534
GAAACGGTAT GACTTAATTT TAGAAAGAAA ATTATATTCT GTTTATTATG ACAAATGAAA 9594
GAGAAAATAT ATATTTTTAA TGGAAAGTTT GTAGCATTTT TCTAATAGGT ACTGCCATAT 9654 
TTTTCTGTGT GGAGTATTTT TATAATTTTA TCTGTATAAG CTGTAATATC ATTTTATAGA 9714 
AAATGCATTA TTTAGTCAAT TGTTTAATGT TGGAAAACAT ATGAAATATA AATTATCTGA 9774 
ATATTAGATG CTCTGAGAAA TTGAATGTAC CTTATTTAAA AGATTTTATG GTTTTATAAC 9834
TATATAA&TG ACATTATTAA AGTTTTCAAA TTATTTTTTA TTGCTTTCTC TGTTGCTTTT 9894 
ATTti@i@l@> 9898 



number : J ~ 


Length of sequence: 401
Sequence type: amino acid
Strandedness: single stranded
Topology: linear
Molecule type: protein


Aet Asn 

i(3 -20 



eu Asp lie Ser 
10 


Lie Lys Trp Thr* Thr ''gEBLGljj Th r Pne Pro Pro Lys Tyr Leu His 



1-5 

Asp nin 
my T ftr Tvr Leu Ly 



Hi s Gin Leu Lg 


n iu Hi" -fyg 



(yg Aia Prn nya Pr 
i(frrh> H&r kap 1-iin (ys Lej 

5g j<§|Qi@i@lQl©i ^ 6 



Tyr cys, 

51n Tvr v al Lys G in rcln rysi~Agn 


Cys Asp Lys Cys Pro 
20 

Ala Lys Trp Lys Thr 
35 

Tyr Thr Asp Ser Trp His 
50 

Pro Val Cys Lys Glu Leu 
65 

Thr His Asn Arg Val Cys 
80 

lie Glu Phe Cys Leu Lys 
95 


7(%i@i§g@i«> 7 ^j@i@i@i<ai@i@i 

31u Cys Lys GjiT Gly Arg Tyr Leu gI; 
3!^|g|g |@MQ§i§§ > 9C1@|@|@|@|@|@L 

!pis ^Arg Ser^Cya^Pro ProGly Phe gI\Lval Val Gin Ala Gly Thr 

mmmm& 105 (|si@i@i@isi^> no 

[lu Arg Askn Thr Val Cys Lys Arja Cys Pro Asp Gly Phe Phe 

!!@!@|@!@p i2o<|@!@pi@i&~> 125 

.sn uiu I'Jjx Ser Ser ^ys~ ATaPro Cys Arg Lys His Thr Asn 
H@|@|@iP 135<lg|@|@|@|@||^ 140 

er val Phe Gly Leu ieu Leu Tnr/Gln Lys Gly Asn Ala Thr 
ii@|@|@|@|f) 150 C i@lQlQl@lQ|C ) 155 
sp Asn Ile_cys Ser Gly Asn aex^Glu Ser Thr Gin Lys Cys 
i@Pi@i@i@i£) 1 6 5 (MM!@!@i@ 
le Asp Va _ 


170 


@Val fc.ro Tnr 


jrhr Leu Cys tiiu 
180 

Phe Thr Pro Asn T 



90 <|@§@ii||i@i£) 195 
sn LeujPToGiSlThr Lys Val Asn Al, 
205 t@|§|@|@igi£) 210 
Lys Arg Gin His Ser Ser Gin Glu Gin Thr Phe Gin Leu Leu Lys 


Ala Phe Phe Arg Phe Ala 
185 

Leu Ser Val Leu Val Asp 
200 

£lu Ser Val Glu Arg lie 
215 




225 
In Asn 
j|i@§@[@^ 24 
Asp lie Asp Leu 

j@Pl@lC > 255 

is Ala Asn_Leu Thr 
®|@|@|@|g) 270 
Leu Pro G±y\_Lys Lys 
i280<^@|@|@ii@l<^i7 28 
@Ile 
j<§295( 
!<§Leu 
!<§310< 
;<§Met 
!€ 325( 
Kval 
IC 3 4 0 < 
«Thr 
j<?355< 
i@Asn 
!<§370< 


n Lys Asp Gin, 

oCi@l@l@l@i@l 


^sp 



;ys Glu Asn 
|@M@|@M@ 
The ulu ulp-^Le 


0 


ys Ala Cys Lys Pro 
30 

ileTLys Asn 


Tr 


rSrc 

mm 


lis Ala Leu Lys 
Fhr (Jin tjer Leu 
^t Tyr Lys Leu 


315 
His 
330 
Lys 
345 
Tyr 
360 


230 

lie Val Lys Lys lie 
245 

er Val Gin Arg His lie 
260 

eu Arg Ser Leu Met Glu 

„_ 275 

yai Gly AlauSlu Asp He Glu Lys Thr 

290 

er Asp Gln_Jle Leu Lys Leu Leu Ser 
305 

ly Asp Gl^Asp Thr Leu Lys Gly Leu 

@|@|@P|P@> 320 

er Lys Thr Tyr His Phe Pro Lys Thr 

^fi@|@|@H> 335 
ys Thr Tle^Arg Phe Leu His Ser Phe 
350 


h Lys LeuPhe Leu Glu Met He Gly 
"^|@|@|@|S> 365 
n val Gin_J5er Val Lys^lle §§r Cys Leu 

" 380 


Sequence number:
Length of sequence: 1206
Sequence type: nucleic acid
Strandedness: single stranded
Topology: linear
Molecule type: cDNA

ATGAACAACT TGCTGTGCTG CGCGCTCGTG TTTCTGGACA TCTCCATTAA GTGGACCACC 60
CAGGAAACGT TTCCTCCAAA GTACCTTCAT TATGACGAAG AAACCTCTCA TCAGCTGTTG 120
TGTGACAAAT GTCCTCCTGG TACCTACCTA AAACAACACT GTACAGCAAA GTGGAAGAGC 180
GTGTGCGCCC CTTGCCCTGA CCACTACTAC ACAGACAGCT GGCACACCAG TGACGAGTGT 240
CTATACTGCA GCCCCGTGTG CAAGGAGCTG CAGTACGTCA AGCAGGAGTG CAATCGCACC 300
CACAACCGCG TGTGCGAATG CAAGGAAGGG CGCTACCTTG AGATAGAGTT CTGCTTGAAA 360
CATAGGAGCT GGCCTCCTGG ATTTGGAGTG GTGCAAGCTG GAACCCCAGA GCGAAATACA 420
GTTTGCAAAA GATGTCCAGA TGGGTTCTTC TCAAATGAGA CGTCATCTAA AGCACCCTGT 480
AGAAAACACA CAAATTGCAG TGTCTTTGGT CTCCTGCTAA CTCAGAAAGG AAATGCAACA 540
CACGACAACA TATGTTCCGG AAACAGTGAA TCAACTCAAA AATGTGGAAT AGATGTTACC 600
CTGTGTGAGG AGGCATTCTT CAGGTTTGCT GTTCCTACAA AGTTTACGCC TAACTGGCTT 660
AGTGTCTTGG TAGACAATTT GCCTGGCACC AAAGTAAACG CAGAGAGTGT AGAGAGGATA 720
AAACGGCAAC ACAGCTCACA AGAACAGACT TTCCAGCTGC TGAAGTTATG GAAACATCAA 780
AACAAAGACC AAGATATAGT CAAGAAGATC ATCGAAGATA TTGACCTCTG TGAAAACAGC 840
GTGCAGCGGC ACATTGGACA TGCTAACCTC ACCTTCGAGC AGCTTCGTAG CTTGATGGAA 900
AGCTTACCGG GAAAGAAAGT GGGAGCAGAA GACATTGAAA AAACAATAAA GGCATGCAAA 960
CCCAGTGACC AGATCCTGAA GCTGCTCAGT TTGTGGCGAA TAAAAAATGG CGACCAAGAC 1020
ACCTTGAAGG GCCTAATGCA CGCACTAAAG CACTCAAAGA CGTACCACTT TCCCAAAACT 1080
GTCACTCAGA GTCTAAAGAA GACCATCAGG TTCCTTCACA GCTTCACAAT GTACAAATTG 1140
TA?ChGApJZ2~TK^^ AACCAGGT ^ flftTPft^TiWi A ATi^i-A^PT^ 1200
TTATA^@l@l@|(aM 1206
(3) Computer: Apple Macintosh;

(i) Operating System: Macintosh; 

(ii) Macintosh File Type: text with line 
termination 

(iii) Line Terminator: Pre-defined by
text type file;

(iv) Pagination: Pre-defined by text 
type file; 

(v) End-of-file: Pre-defined by text 
type file; 

(vi) Media: (A) Diskette—3.50 inch, 400
Kb storage;

(B) Diskette—3.50 inch, 800 Kb
storage;

(C) Diskette—3.50 inch, 1.4 Mb
storage;

(vii) Print Command: Use PRINT
command from any Macintosh
Application that processes text files,
such as MacWrite or Teach Text;

(4) Magnetic tape: 0.5 inch, up to 2400 
feet; 

(i) Density: 1600 or 6250 bits per inch,
9 track;

(ii) Format: raw, unblocked;

(iii) Line Terminator: ASCII Carriage
Return plus optional ASCII Line Feed;

(iv) Pagination: ASCII Form Feed or 
Series of Line Terminators; 

(v) Print Command (Unix shell version 
given here as sample response — mt/ 
dev/rmt0; lpr/dev/rmt0):

(g) Computer readable forms that are 
submitted to the Office will not be 
returned to the applicant. 

(h) All computer readable forms shall
have a label permanently affixed thereto
on which has been hand printed or
typed, a description of the format of the
computer readable form as well as the
name of the applicant, the title of the
invention, the date on which the data
were recorded on the computer readable
form and the name and type of computer
and operating system which generated
the files on the computer readable form.
If all of this information cannot be
printed on a label affixed to the
computer readable form, by reason of
size or otherwise, the label shall include
the name of the applicant and the title of
the invention and a reference number,
and the additional information may be
provided on a container for the
computer readable form with the name
of the applicant, the title of the
invention, the reference number and the
additional information affixed to the
container. If the computer readable form
is submitted after the date of filing
under 35 U.S.C. 111, after the date of
entry in the national stage under 35
U.S.C. 371 or after the time of filing, in
the United States Receiving Office, an
international application under the PCT,
the labels mentioned herein must also
include the date of the application and
the application number, including series
code and serial number.

§ 1.825 Amendments to or replacement of
sequence listing and computer readable
copy thereof.

(a) Any amendment to the paper copy
of the "Sequence Listing" (§ 1.821(c))
must be made by the submission of
substitute sheets. Amendments must be
accompanied by a statement that
indicates support for the amendment in
the application, as filed, and a statement
that the substitute sheets include no
new matter. Such a statement must be a
verified statement if made by a person
not registered to practice before the
Office.

(b) Any amendment to the paper copy
of the "Sequence Listing," in accordance
with paragraph (a) of this section, must
be accompanied by a substitute copy of
the computer readable form (§ 1.821(e))
including all previously submitted data
with the amendment incorporated
therein, accompanied by a statement
that the copy in computer readable form
is the same as the substitute copy of the
"Sequence Listing." Such a statement
must be a verified statement if made by
a person not registered to practice
before the Office.

(c) Any appropriate amendments to
the "Sequence Listing" in a patent, e.g.,
by reason of reissue or certificate of
correction, must comply with the
requirements of paragraphs (a) and (b)
of this section.

(d) If, upon receipt, the computer
readable form is found to be damaged or
unreadable, applicant must provide,
within such time as set by the
Commissioner, a substitute copy of the
data in computer readable form
accompanied by a statement that the
substitute data is identical to that
originally filed. Such a statement must
be a verified statement if made by a
person not registered to practice before
the Office.

Appendix A — Sample Sequence Listing
(1) GENERAL INFORMATION:


(i) APPLICANT: Doe, Joan X, Doe, John Q
(ii) TITLE OF INVENTION: Isolation and
Characterization of a Gene Encoding a
Protease from Paramecium sp.
(iii) NUMBER OF SEQUENCES: 2

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: Smith and Jones 

(B) STREET: 123 Main Street 

(C) CITY: Smalltown 

(D) STATE: Anystate 

(E) COUNTRY: USA

(F) ZIP: 12345

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Diskette, 3.50 inch, 800
Kb storage

(B) COMPUTER: Apple Macintosh

(C) OPERATING SYSTEM: Macintosh 5.0

(D) SOFTWARE: MacWrite 

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: 09/999,999 

(B) FILING DATE: 28-FEB-1989

(C) CLASSIFICATION: 999/99

(vii) PRIOR APPLICATION DATA:

(A) APPLICATION NUMBER: PCT/US88/ 



(B) FILING DATE: 01-MAR-1988 

(viii) ATTORNEY/AGENT INFORMATION: 

(A) NAME: Smith, John A. 

(B) REGISTRATION NUMBER: 00001 

(C) REFERENCE/DOCKET NUMBER: 01- 
0001 

(ix) TELECOMMUNICATION 
INFORMATION: 

(A) TELEPHONE: (909) 999-0001 

(B) TELEFAX: (009) 999-0002 

(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 954 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: genomic DNA

(iii) HYPOTHETICAL: yes

(iv) ANTI-SENSE: no

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Paramecium sp 
(C) INDIVIDUAL/ISOLATE: XYZ2 
(G) CELL TYPE: unicellular organism 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: genomic 

(B) CLONE: Para-XYZ2/36 

(x) PUBLICATION INFORMATION: 

(A) AUTHORS: Doe, Joan X, Doe, John Q

(B) TITLE: Isolation and Characterization
of a Gene Encoding a Protease from
Paramecium sp.

(C) JOURNAL: Fictional Genes

(D) VOLUME: I

(E) ISSUE: 1

(F) PAGES: 1-20

(G) DATE: 02-MAR-1988
(K) RELEVANT RESIDUES IN SEQ ID NO:
1: FROM 1 TO 954

BILLING CODE 3510-16-M
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

ATCGGGATAG TACTGGTCAA GACCGGTGGA CACCGGTTAA CCCCGGTTAA GTACCGGTTA 60 

TAGGCCATTT CAGGCCAAAT GTGCCCAACT ACGCCAATTG TTTTGCCAAC GGCCAACGTT 120 

ACGTTCGTAC GCACGTATGT ACCTAGGTAC TTACGGACGT GACTACGGAC ACTTCCGTAC 180 

GTACGTACGT TTACGTACCC ATCCCAACGT AACCACAGTG TGGTCGCAGT GTCCCAGTGT 240

ACACAGACTG CCAGACATTC TTCACAGACA CCCC ATG ACA CCA CCT GAA CGT CTC 295 

Met Thr Pro Pro Glu Arg Leu 
-30 

TTC CTC CCA AGG GTG TGT GGC ACC ACC CTA CAC CTC CTC CTT CTG GGG 343 
Phe Leu Pro Arg Val Cys Gly Thr Thr Leu His Leu Leu Leu Leu Gly 
-25 -20 -15 

CTG CTG CTG GTT CTG CTG CCT GGG GCC CAT GTGAGGCAGC AGGAGAATGG 393 
Leu Leu Leu Val Leu Leu Pro Gly Ala His 
-10 -5 

GGTGGCTCAG CCAAACCTTG AGCCCTAGAG CCCCCCTCAA CTCTGTTCTC CTAG GGG 450 

Gly 

CTC ATG CAT CTT GCC CAC AGC AAC CTC AAA CCT GCT GCT CAC CTC ATT 498 
Leu Met His Leu Ala His Ser Asn Leu Lys Pro Ala Ala His Leu Ile
15 10 15 

GTAAACATCC ACCTGACCTC CCAGACATGT CCCCACCAGC TCTCCTCCTA CCCCTGCCTC 558 

AGGAACCCAA GCATCCACCC CTCTCCCCCA ACTTCCCCCA CGCTAAAAAA AACAGAGGGA 618 

GCCCACTCCT ATGCCTCCCC CTGCCATCCC CCAGGAACTC AGTTGTTCAG TGCCCACTTC 678 

TAC CCC AGC AAG CAG AAC TCA CTG CTC TGG AGA GCA AAC ACG GAC CGT 726 
Tyr Pro Ser Lys Gln Asn Ser Leu Leu Trp Arg Ala Asn Thr Asp Arg
20 25 30 

GCC TTC CTC CAG GAT GGT TTC TCC TTG AGC AAC AAT TCT CTC CTG GTC 774 
Ala Phe Leu Gln Asp Gly Phe Ser Leu Ser Asn Asn Ser Leu Leu Val

35 40 45

TAGAAAAAAT AATTGATTTC AAGACCTTCT CCCCATTCTG CCTCCATTCT GACCATTTCA 834 

GGGGTCGTCA CCACCTCTCC TTTGGCCATT CCAACAGCTC AAGTCTTCCC TGATCAAGTC 894 

ACCGGAGCTT TCAAAGAAGG AATTCTAGGC ATCCCAGGGG ACCCACACCT CCCTGAACCA 954 

(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 62 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(ix) FEATURE: 

(A) NAME/KEY: signal sequence 

(B) LOCATION: -34 to -1 


(C) IDENTIFICATION METHOD: similarity 
to other signal sequences, hydrophobic 

(D) OTHER INFORMATION: expresses 
protease 

(x) PUBLICATION INFORMATION: 

(A) AUTHORS: Doe, Joan X, Doe, John Q

(B) TITLE: Isolation and Characterization
of a Gene Encoding a Protease from
Paramecium sp.


(C) JOURNAL: Fictional Genes

(D) VOLUME: I

(E) ISSUE: 1

(F) PAGES: 1-20

(G) DATE: 02-MAR-1988

(K) RELEVANT RESIDUES IN SEQ ID NO: 
2: FROM -34 TO 46 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:


Met Thr Pro Pro Glu Arg Leu Phe Leu Pro Arg Val Cys Gly Thr Thr 
-30 -25 -20 

Leu His Leu Leu Leu Leu Gly Leu Leu Leu Val Leu Leu Pro Gly Ala 
-15 -10 -5 

His Gly Leu Met His Leu Ala His Ser Asn Leu Lys Pro Ala Ala His 
1 5 10

Leu Ile Tyr Pro Ser Lys Gln Asn Ser Leu Leu Trp Arg Ala Asn Thr
15 20 25 30

Asp Arg Ala Phe Leu Gln Asp Gly Phe Ser Leu Ser Asn Asn Ser Leu
35 40 45 

Leu Val 



